CLAIMS



[Claim(s)] 

[Claim 1] A multimedia public telephone system that has voice and image I/O means and
communication means, and that enables telephone calls using both voice and images,
characterized by comprising: a voice signal receiving means that receives a caller's voice
through a communication system; a voice / transliteration switch that selects whether the
received voice signal is output as voice as it is or output as subtitles; a voice
recognition unit that recognizes the received voice signal when the switch selects subtitle
output; and speech recognition generation equipment that converts the voice recognized by
the voice recognition unit into character data and displays it as subtitles on a display.
[Claim 2] A multimedia public telephone system that has voice and image I/O means and
communication means, and that enables telephone calls using both voice and images,
characterized by comprising: a character / voice conversion switch that selects whether a
caller's voice is transmitted to the other party as it is or the caller transmits by
character input; text understanding equipment that, when the switch selects character
transmission, recognizes by character recognition the character string input by the caller
and generates it as a sentence; a voice synthesizer that converts the text generated by the
text understanding equipment into a speech waveform by speech synthesis; and a voice
converter that converts the speech waveform synthesized by the voice synthesizer into a
voice signal and transmits it to the other party through a communication system.

DETAILED DESCRIPTION 



[Detailed Description of the Invention] 
[0001] 

[The technical field to which invention belongs] The present invention relates to a
multimedia public telephone system that can transmit and receive images and other media
in addition to voice over a public telephone.
[0002] 

[Description of the Prior Art] As shown in drawing 12, the present multimedia public
telephone system has an image I/O device, with image input by a CCD camera 1 and image
output on a screen 2. This adds functions such as conversing while looking at the other
party's face, a FAX function, a map of the surrounding area and information guidance for
the installation site, and a memorandum function. Various other functions are also
provided, such as listening to a weather report and ordering information on events. For
operation input on this public telephone, a push button 3 and a pointing device 4 (pen
input or touch screen) are used.



[0003] 

[Problem(s) to be Solved by the Invention] The following technical problems occur in the 
conventional multimedia public telephone system. 

[0004] (1) A person with hearing loss cannot hear the conversation even if he or she can
see the other party's face.

[0005] (2) The other party's voice (the content of the conversation) cannot be read as
text.

[0006] (3) When a person with hearing loss also has a speech impairment and has difficulty
conversing, he or she conveys what to say in writing with the memorandum function or the
like, but the written content cannot be conveyed to the other party by voice.

[0007] (4) Even a person without a hearing impairment may mishear or fail to catch what was
said, which can interrupt the conversation.

[0008] As mentioned above, even though the conventional system has various multimedia
functions, the range of functions that a hearing-impaired person can use is limited.

[0009] The purpose of this invention is to provide a multimedia public telephone system
with improved usability for hearing-impaired persons and others.
[0010] 

[Means for Solving the Problem] To solve the above-mentioned technical problems, the
present invention adds the SUS to the conventional functions, converts the other party's
voice into subtitles, and displays them on a display. To further ease speech-related
handicaps, it also adds a speech-understanding function to the memorandum function so that
characters written in the memorandum are converted into voice and transmitted to the other
party. The invention is characterized by the following composition.
[0011] (1st invention) A multimedia public telephone system that has voice and image I/O
means and communication means, and that enables telephone calls using both voice and
images, is characterized by having: a voice signal receiving means that receives a
caller's voice through a communication system; a voice / transliteration switch that
selects whether the received voice signal is output as voice as it is or output as
subtitles; a voice recognition unit that recognizes the received voice signal when the
switch selects subtitle output; and speech recognition generation equipment that converts
the voice recognized by the voice recognition unit into character data and displays it as
subtitles on the display.

[0012] (2nd invention) A multimedia public telephone system that has voice and image I/O
means and communication means, and that enables telephone calls using both voice and
images, is characterized by having: a character / voice conversion switch that selects
whether a caller's voice is transmitted to the other party as it is or the caller transmits
by character input; text understanding equipment that, when the switch selects character
transmission, recognizes by character recognition the character string input by the caller
and generates it as a sentence; a voice synthesizer that converts the text generated by the
text understanding equipment into a speech waveform by speech synthesis; and a voice
converter that converts the speech waveform synthesized by the voice synthesizer into a
voice signal and transmits it to the other party through a communication system.

[0013] 

[Embodiments of the Invention] Drawing 1 and drawing 2 are block diagrams of the
multimedia public telephone system showing an embodiment of this invention. Drawing 3 to
drawing 6 show the processing of each part, with corresponding parts given the same
reference numerals. Hereafter, the composition and processing of each part are explained
in detail.

[0014] (1) Receiving the other party's voice as a subtitle display (drawing 1).

[0015] An audio input unit 11 inputs voice from the microphone of the handset mounted on
the public telephone and generates a signal from the voice the caller speaks into the
handset. The normal-mode-rejection noise filter 12 removes noise that mixes
indiscriminately into the conversation.

[0016] Communication system 13 performs communication control over a digital line, for
example by an ATM switching system that controls a high-speed, broadband B-ISDN network or
a 2B+D network, and transmits the voice signal from the audio input unit 11 to the voice
receiving set 14.

[0017] With the voice / transliteration switch 15, the recipient selects whether the
received voice (analog signal) is heard as voice as it is, or is converted into characters
(digital signal) and shown on the display.

[0018] Speech-understanding equipment 16 recognizes and understands the received voice
when the switch 15 is set to transliteration. To do so, it is built as a
speaker-independent system that performs voice analysis, feature extraction that removes
ambiguous words by fuzzy control, word-level speech recognition, and syntactic and semantic
analysis. It is equipped with the speech-recognition processing section 161, the
speech-synthesis processing section 162, and a voice pattern dictionary 163.
[0019] In this equipment 16, as the processing flows of drawing 3 and drawing 4 show, the
input voice signal first undergoes A/D conversion. Voice analysis then performs a rough
frequency-spectrum analysis of the input signal, feature extraction derives time-series
feature parameters of the voice signal from the voice-analysis result, and segmentation
processing divides the signal into voice units.
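The segmentation step in [0019] can be pictured with a minimal sketch. The text does not
say how voice units are delimited, so the version below assumes a simple short-time energy
threshold over fixed-size frames. That is a common stand-in, not necessarily the method of
this system, and the function names, frame size, and threshold are all hypothetical.

```python
from typing import List, Tuple


def frame_energies(samples: List[float], frame_size: int) -> List[float]:
    """Mean squared amplitude of each full frame (a crude time-series feature)."""
    energies = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        energies.append(sum(x * x for x in frame) / frame_size)
    return energies


def segment(energies: List[float], threshold: float) -> List[Tuple[int, int]]:
    """Group consecutive frames above the threshold into (first, last) frame ranges."""
    segments = []
    start = None
    for i, energy in enumerate(energies):
        if energy > threshold and start is None:
            start = i                       # a voiced run begins
        elif energy <= threshold and start is not None:
            segments.append((start, i - 1))  # the run ended on the previous frame
            start = None
    if start is not None:
        segments.append((start, len(energies) - 1))
    return segments


# Hypothetical digitized signal: silence, a burst, silence, another burst.
signal = [0.0] * 4 + [0.5, -0.5] * 4 + [0.0] * 4 + [0.8, -0.8] * 2
energies = frame_energies(signal, frame_size=4)
print(segment(energies, threshold=0.01))
```

Each returned pair marks one candidate voice unit by its first and last frame index.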

[0020] In feature extraction, fuzzy reasoning picks out and removes redundant words
(fillers such as "well"), so that only the words actually needed when people talk to each
other are extracted. Furthermore, words needed in daily use are registered beforehand, and
words that become necessary later are gradually recognized and added by the learning
function.
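A rough sketch of the redundant-word removal in [0020] follows. The text attributes this
step to fuzzy reasoning without giving rules, so the code assumes a much simpler stand-in:
each word carries a hand-set degree of redundancy between 0 and 1, and words at or above a
cutoff are dropped. The table values, the cutoff, and the function names are hypothetical;
only "well" comes from the text.

```python
from typing import Dict, List, Set

# Hypothetical membership degrees: how strongly a word counts as redundant.
# "well" is the filler named in the text; the other entries are illustrative.
REDUNDANCY: Dict[str, float] = {"well": 0.9, "um": 0.95, "like": 0.6}


def remove_redundant(words: List[str], cutoff: float = 0.8) -> List[str]:
    """Keep words whose redundancy degree is below the cutoff."""
    return [w for w in words if REDUNDANCY.get(w.lower(), 0.0) < cutoff]


def learn_required(vocabulary: Set[str], words: List[str]) -> List[str]:
    """Add unseen words to the registered vocabulary; return the newly added ones."""
    added = []
    for word in words:
        if word not in vocabulary:
            vocabulary.add(word)
            added.append(word)
    return added


vocabulary = {"meet", "station"}  # words registered beforehand
heard = ["well", "meet", "um", "at", "the", "station"]
kept = remove_redundant(heard)
print(kept)
print(learn_required(vocabulary, kept))
```

A graded score rather than a fixed stop list lets a borderline word such as "like" survive
unless the cutoff is lowered.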

[0021] Subsequently, speech recognition obtains a phoneme sequence by comparison with the
voice standard patterns, and word recognition recognizes a word by collating the phoneme
sequence with the word standard patterns held in the knowledge base. If no standard
pattern exists for this recognition, study processing (recognition and registration) is
performed and the word knowledge is added to the knowledge base.
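The collate-or-learn behavior of [0021] can be sketched as below. The text does not name a
matching measure, so this sketch assumes edit distance between phoneme sequences, and it
assumes the label for an unknown word is supplied from outside during study processing.
The class name, the distance limit, and the sample words are hypothetical.

```python
from typing import Dict, Optional, Tuple

Phonemes = Tuple[str, ...]


def edit_distance(a: Phonemes, b: Phonemes) -> int:
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


class KnowledgeBase:
    """Hypothetical store of word standard patterns keyed by phoneme sequence."""

    def __init__(self) -> None:
        self.patterns: Dict[Phonemes, str] = {}

    def collate(self, phonemes: Phonemes, max_distance: int = 1) -> Optional[str]:
        """Return the closest registered word, or None if nothing is near enough."""
        best_word: Optional[str] = None
        best_dist = max_distance + 1
        for pattern, word in self.patterns.items():
            dist = edit_distance(phonemes, pattern)
            if dist < best_dist:
                best_word, best_dist = word, dist
        return best_word

    def recognize(self, phonemes: Phonemes, label: str) -> str:
        """Collate; if no pattern matches, register the labeled word (study processing)."""
        word = self.collate(phonemes)
        if word is None:
            self.patterns[phonemes] = label
            word = label
        return word


kb = KnowledgeBase()
kb.patterns[("d", "e", "n", "w", "a")] = "denwa"
print(kb.recognize(("d", "e", "n", "w", "o"), label="unknown"))  # one phoneme off
print(kb.recognize(("k", "o", "e"), label="koe"))                # not yet registered
```

After the second call the new pattern is stored, so a later collation of the same phoneme
sequence succeeds without study processing.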

[0022] Subsequently, syntax collating and syntax recognition check the recognized words
for syntactic errors by collating them with the syntax patterns of the knowledge base; if
an error is found, syntax recognition is performed again by re-verification. Semantic
analysis and semantic recognition then examine whether the recognized words and syntax are
semantically appropriate, and syntax-analysis processing and semantic-analysis processing
are repeated until an appropriate result is obtained. Through these analyses, which
include a determination of whether conversion to characters is possible, transliteration
processing converts the convertible portions into kanji or kana.
[0023] Returning to drawing 1, speech recognition generation equipment 17 converts the
voice data whose content has been recognized by speech-understanding equipment 16 into
character data in text form. The display unit 18 shows this character data as subtitles on
the screen 2 of the public telephone.
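The receive path of drawing 1 reduces to a two-way routing decision made by the switch 15.
The sketch below illustrates only that decision. The enum, the function names, and the
placeholder recognizer (which merely decodes bytes) are hypothetical and stand in for the
recognition and text-generation stages described above.

```python
from enum import Enum
from typing import Callable, Tuple, Union


class ReceiveMode(Enum):
    VOICE = "voice"        # play the received voice as it is
    SUBTITLE = "subtitle"  # recognize the voice and show it as text


def handle_received(signal: bytes, mode: ReceiveMode,
                    recognize: Callable[[bytes], str]) -> Tuple[str, Union[bytes, str]]:
    """Route a received voice signal according to the switch position.

    Returns ("speaker", signal) for pass-through, or ("screen", text) when the
    signal is recognized and converted to character data for display.
    """
    if mode is ReceiveMode.VOICE:
        return ("speaker", signal)
    return ("screen", recognize(signal))


def placeholder_recognizer(signal: bytes) -> str:
    """Stand-in for the recognition stage: simply decodes the bytes as UTF-8."""
    return signal.decode("utf-8")


print(handle_received(b"hello", ReceiveMode.VOICE, placeholder_recognizer))
print(handle_received(b"hello", ReceiveMode.SUBTITLE, placeholder_recognizer))
```

Passing the recognizer in as a parameter keeps the switch logic independent of whichever
recognition method is actually used.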

[0024] Thus, with the composition of drawing 1, the speech-understanding equipment
performs speech recognition on the received voice, converts it into characters, and
displays them on the screen, so that even a person with hearing loss can read the other
party's conversation as text.

[0025] Moreover, the subtitle display during a telephone call also lets a person without a
hearing impairment avoid the risk of missing what was said.

[0026] (2) Transmitting as voice from character input (drawing 2).

[0027] The character button of the character / voice conversion switch 19 is selected
when the recipient of the conversation, in order to answer the sender, communicates using
content written with the memorandum function.

[0028] When the character button of the switch 19 is pushed, the character input unit 20
displays a memo pad (text form) on the screen and enables input of the content of the talk
with the pointing device (electronic pen).

[0029] Text understanding equipment 21 analyzes the handwritten character string and
recognizes it as a text. It is equipped with the style recognition processing section 211,
the style formal analysis section 212, the character-pattern recognition dictionary 213,
and the style recognition generation section 214.
[0030] In this text understanding equipment 21, as the processing flow of drawing 5 shows,
the character recognition processing section 211 analyzes the handwritten character input
from its character patterns and style form. If the analyzed content differs from what was
input, re-input is accepted as a correction. The handwritten input is then recognized as a
text, and the style recognition generation section 214 generates it as a text.
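The correction loop in [0030] can be sketched as follows, under the assumption that the
writer is shown each recognition result and either accepts it or writes the input again.
The function names and the toy recognizer are hypothetical, and plain strings stand in for
handwriting strokes.

```python
from typing import Callable, Iterable, Optional


def recognize_with_correction(attempts: Iterable[str],
                              recognize: Callable[[str], str],
                              confirm: Callable[[str], bool]) -> Optional[str]:
    """Recognize each handwriting attempt until the writer confirms the result.

    `attempts` yields successive (re)inputs, `recognize` maps a raw input to
    text, and `confirm` reports whether the recognized text is what was meant.
    Returns the confirmed text, or None if every attempt was rejected.
    """
    for raw in attempts:
        text = recognize(raw)
        if confirm(text):
            return text
    return None


def toy_recognizer(raw: str) -> str:
    """Stand-in for character-pattern recognition: trims and lowercases."""
    return raw.strip().lower()


intended = "hello"
result = recognize_with_correction(
    attempts=["he1lo", " Hello "],         # first input is misread, second is a rewrite
    recognize=toy_recognizer,
    confirm=lambda text: text == intended,
)
print(result)
```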

[0031] A voice synthesizer 22 converts the text recognized and generated by text
understanding equipment 21 into a voice signal by speech synthesis, and is equipped with
the speech-synthesis-system section 221 and the voice pattern dictionary 222.

[0032] The voice synthesizer 22 synthesizes voice by a rule-based synthesis method. For a
sentence of mixed kanji and kana, it analyzes readings, words, clause boundaries, and so
on by syntax analysis and semantic analysis that refer to a Japanese dictionary. From this
analysis result, and with reference to the voice pattern dictionary, it determines the
parameters of the phoneme sequence (accent, intonation, and phoneme duration), and
synthesizes voice using these as parameters of a vocal tract model.
[0033] The voice converter 23 converts the voice signal waveform into voice. In this
conversion, the caller selects male voice or female voice on the system and pushes the
transmitting button, and the synthesized voice is transmitted as voice to the other party
through communication system 24. The other party's multimedia public telephone 25 receives
either the direct voice or the synthesized voice.
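To make the parameter-setting step of rule-based synthesis and the male/female voice
choice more concrete, here is a minimal sketch. It does not implement a vocal tract model
or produce audio; it only shows rules assigning a duration and pitch to each phoneme. The
base pitches, the rule values, and all names are hypothetical and are not taken from this
specification.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical base pitches in Hz for the two selectable voices.
BASE_PITCH_HZ = {"male": 120.0, "female": 220.0}
VOWELS = {"a", "i", "u", "e", "o"}


@dataclass
class PhonemeParams:
    phoneme: str
    duration_ms: int
    pitch_hz: float


def plan_prosody(phonemes: List[str], accent_index: int, voice: str) -> List[PhonemeParams]:
    """Assign duration and pitch to each phoneme by simple rules.

    Rules (all illustrative): vowels last longer than consonants, the accented
    phoneme is raised by 20 percent, and pitch falls 2 percent per phoneme to
    imitate a declining intonation.
    """
    base = BASE_PITCH_HZ[voice]
    plan = []
    for i, phoneme in enumerate(phonemes):
        duration = 100 if phoneme in VOWELS else 60
        pitch = base * (1.0 - 0.02 * i)
        if i == accent_index:
            pitch *= 1.2
        plan.append(PhonemeParams(phoneme, duration, round(pitch, 1)))
    return plan


for params in plan_prosody(["k", "o", "e"], accent_index=1, voice="female"):
    print(params)
```

In a full synthesizer, a parameter list of this kind would drive the waveform generation
stage.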

[0034] Thus, with the composition of drawing 2, characters input from the screen are
converted into voice by text understanding and speech synthesis and sent to the other
party as a voice signal. Even when a person with hearing loss also has a speech impairment
and has difficulty conversing, content written with the memorandum function or the like
can be conveyed to the other party by voice.

[0035] Drawing 7 is an example of installation of the character / voice conversion
switch. The switch consists of a subtitle button for displaying subtitles and a character
input button for inputting characters and converting them into voice. Drawing 8 is an
example of a screen layout. The screen consists of four parts: the image section 31 used
for the other multimedia functions; the receiving content text screen 32, which shows the
voice converted into subtitles; the transmitting content text screen 33 for inputting
characters when transmitting content to the other party; and the voice mode 34, which
selects male voice or female voice when the input content is transmitted as voice.

[0036] In addition, this invention covers both a multimedia public telephone system that
carries the equipment of both the composition of drawing 1 and the composition of drawing
2, and a multimedia public telephone system that carries only one of the two
independently.

[0037] Drawing 9 shows an example of the interface composition of a multimedia public
telephone system that has both the function of receiving the other party's voice as a
subtitle display and the function of transmitting voice from character input.

[0038] As the procedure in drawing 10 shows, the voice input section of drawing 9 inputs
voice from a microphone (S1), removes the noise mixed with the voice by the noise filter
(S2), and converts the cleaned voice into a digital signal on a voice input board (S3).
Under control of a voice input driver, speech recognition with a learning function based
on a neural network is carried out (S4). The reasoning of the fuzzy reasoning section
removes redundant signals from the recognized voice signal (S5). When the recognized voice
signal is not found in the knowledge base file, study processing is performed using the
knowledge base, the knowledge base file, and the automatic-programming editor, and the
signal is memorized (S6). Finally, under window control, the result of the completed voice
signal processing is displayed on the editor application screen of a window (S7).

[0039] As the procedure in drawing 11 shows, in the character input section of drawing 9,
characters are input with a text editor and a pen (S11). To recognize the input by pattern
processing of the input characters, the character-pattern recognition section, the study
section, the exclusive knowledge base, and a general-purpose [illegible] editor learn and
memorize any word not stored among the patterns or in the database (db) of input
characters, and the characters are recognized by discriminating their patterns (S12).
Subsequently, to convert the recognized characters (digital) into voice (analog), the
speech synthesis system synthesizes voice in male voice or female voice (S13), and the
result of the completed speech synthesis processing is output from the voice output
section (microphone) (S14).

[0040] 

[Effect of the Invention] As described above, according to this invention, the SUS is
added to the conventional multimedia public telephone functions so that the other party's
voice is converted into subtitles and shown on a display, and a speech-understanding
function is further added to the memorandum function so that characters written in the
memorandum are converted into voice and transmitted to the other party. This gives the
following effects.

[0041] (1) The range of use of the multimedia functions can be extended.

[0042] (2) Since the matter being discussed and the points considered important can be
displayed as subtitles, risks such as mishearing can be avoided.

[0043] (3) The range of information that can be exchanged in conversation, such as by
telephone, can be extended.

[0044] (4) A hearing-impaired person can converse by telephone using the subtitle or voice
conversion function.

[0045] (5) When characters are input, converted into voice, and transmitted to the other
party, male voice or female voice can be chosen.



DESCRIPTION OF DRAWINGS 



[Brief Description of the Drawings] 

[Drawing 1] Block diagram of the multimedia public telephone system showing an embodiment
of this invention (part 1).

[Drawing 2] Block diagram of the multimedia public telephone system showing an embodiment
of this invention (part 2).

[Drawing 3] The processing flow in drawing 1 (part 1).

[Drawing 4] The processing flow in drawing 1 (part 2).

[Drawing 5] The processing flow in drawing 2 (part 3).

[Drawing 6] The processing flow in drawing 2 (part 4).

[Drawing 7] An example of switch installation in the embodiment.

[Drawing 8] An example of a screen layout in the embodiment.

[Drawing 9] An example of interface composition in the embodiment.

[Drawing 10] Procedure of the voice input section in drawing 9.

[Drawing 11] Procedure of the character input section in drawing 9.

[Drawing 12] Drawing of the present multimedia public telephone.

[Description of Notations] 



11 -- Audio input unit
12 -- Normal-mode-rejection noise filter
13, 24 -- Communication system
14 -- Voice receiving set
15 -- Voice / transliteration switch
16 -- Voice recognition unit
17 -- Speech recognition generation equipment
18 -- Display unit
19 -- Character / voice conversion switch
20 -- Character input unit
21 -- Text understanding equipment
22 -- Voice synthesizer
23 -- Voice converter
25 -- Multimedia public telephone



